
    Controlling speculative execution through a virtually ordered memory system

    Processors that extract parallelism through speculative execution must be able to identify when mis-speculation has occurred. The three places where mis-speculation can occur are register accesses, control flow prediction and memory accesses. Controlling register and control flow speculation has been well studied, but no scalable techniques for identifying memory dependence violations have been identified. Since speculative execution occurs out of order, this requires tracking the causal order, as well as the addresses, of memory accesses. This thesis uses simulations to investigate tracking the causal order of memory accesses using explicit tags known as virtual timestamps, a distributed and scalable method. Realizable virtual timestamps are necessarily restricted in length, and it is demonstrated that naive allocation schemes seriously constrain execution by allocating virtual timestamps inefficiently. Efficient allocation requires analysis of the number of virtual timestamps required by each section of code. Basic statically and dynamically evaluated analysis methods are established to prevent virtual timestamp allocation from becoming a resource bottleneck. The same analysis is also used to allocate state-saving resources efficiently in a fixed hardware order. The hardware order provides an alternative way of maintaining the causal order using a simple hardware organization. The ability to predict the resources required by regions of code is used as a way of selecting instructions to execute speculatively. This enables resources to be allocated efficiently and is shown to allow large amounts of parallelism to be extracted. It also improves the effectiveness of speculative execution by issuing fewer instructions that will ultimately be rolled back. Using a hierarchy of hardware ordering modules, themselves ordered by explicit virtual timestamps, a scalable ordering system is proposed. This hierarchy forms the basis of a twisted memory system, a multiple-version memory system capable of identifying speculative memory dependence violations. The preliminary investigations presented here show that twisted memory has the potential to support aggressive speculative parallel execution. Particular attention is paid to memory bandwidth requirements.
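    The violation-detection idea the abstract describes, a multiple-version memory in which accesses carry virtual timestamps, can be illustrated with a minimal sketch. All names here (VersionedMemory and its methods) are illustrative, not taken from the thesis:

```python
# Illustrative sketch: a multiple-version memory tagged with virtual
# timestamps. A store arriving "late" in real time but "early" in the
# virtual (causal) order exposes any load that already read a stale value.

class VersionedMemory:
    def __init__(self):
        self.versions = {}   # addr -> list of (timestamp, value) pairs
        self.reads = {}      # addr -> timestamps of loads seen so far

    def load(self, addr, ts):
        # Return the value of the latest store that precedes this load
        # in the virtual order (default 0 if no such store exists).
        self.reads.setdefault(addr, []).append(ts)
        older = [(t, v) for t, v in self.versions.get(addr, []) if t <= ts]
        return max(older)[1] if older else 0

    def store(self, addr, ts, value):
        # Any load with a later virtual timestamp that already read this
        # address consumed a stale value: a dependence violation, so the
        # offending loads must be squashed and re-executed.
        violated = [t for t in self.reads.get(addr, []) if t > ts]
        self.versions.setdefault(addr, []).append((ts, value))
        self.versions[addr].sort()
        return violated

mem = VersionedMemory()
mem.store(0x10, ts=1, value=42)
print(mem.load(0x10, ts=5))             # -> 42
print(mem.store(0x10, ts=3, value=7))   # -> [5]: the ts=5 load is violated
```

    In this sketch the virtual timestamps are unbounded integers; the allocation problem the thesis studies arises precisely because realizable timestamps are restricted in length.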

    Applying Time Warp to CPU Design

    This paper exemplifies the similarities between Time Warp and computer architecture concepts and terminology, and the continued trend towards convergence of ideas in these two areas. Time Warp can provide a means to describe the complex mechanisms being used to allow the instruction execution window to be enlarged. Furthermore, it can extend the current mechanisms, which do not scale, in a scalable manner. The issues involved in implementing Time Warp in a CPU design are also examined, and illustrated with reference to the Wisconsin Multiscalar machine and the Waikato WarpEngine. Finally, the potential performance gains of such a system are briefly discussed. 1. Introduction Computer designers currently face a very interesting set of challenges. The steady increase in the number of transistors on a chip and the speed at which a chip can be clocked has continued its inexorable progress. In 1997, chips with millions of transistors and clock speeds of hundreds of MHz are in routine production an..
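    The core Time Warp mechanism the paper maps onto CPU design, optimistic execution with state saving and rollback when a "straggler" event arrives out of causal order, can be sketched as follows (the class and names are illustrative, not from the paper):

```python
# Hedged sketch of Time Warp: execute events optimistically, checkpoint
# state, and roll back when an event older than local virtual time arrives.

class TimeWarpProcess:
    def __init__(self):
        self.lvt = 0            # local virtual time
        self.state = 0
        self.saved = [(0, 0)]   # checkpoints: (lvt, state)

    def execute(self, ts, delta):
        if ts < self.lvt:       # straggler: causality was violated
            self.rollback(ts)
        self.saved.append((self.lvt, self.state))  # save state, then advance
        self.lvt = ts
        self.state += delta

    def rollback(self, ts):
        # Discard checkpoints at or after the straggler's timestamp and
        # restore the most recent one that precedes it.
        while len(self.saved) > 1 and self.saved[-1][0] >= ts:
            self.saved.pop()
        self.lvt, self.state = self.saved[-1]

p = TimeWarpProcess()
p.execute(10, 1)
p.execute(20, 1)
p.execute(15, 1)        # straggler: rolls back past lvt=20, resumes at 15
print(p.lvt, p.state)   # -> 15 2
```

    In the CPU analogy, events are instructions, local virtual time is program order, and rollback corresponds to squashing mis-speculated work.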

    Constraints on Parallelism Beyond 10 Instructions Per Cycle

    The problem of extracting Instruction-Level Parallelism at levels of 10 instructions per clock and higher is considered. Two different architectures which use speculation on memory accesses to achieve this level of performance are reviewed. It is pointed out that while this form of speculation gives high potential parallelism, it is necessary to retain execution state so that incorrect speculation can be detected and subsequently squashed. Simulation results show that the space to store such state is a critical resource in obtaining good speedup. To make good use of the space it is essential that state be stored efficiently and that it be retired as soon as possible. A number of techniques for extracting the best usage from the available state storage are introduced. Keywords: instruction level parallelism, speculation 1 Introduction Increasingly, computer architects and system designers are seeking to extract more computer performance by making use of parallelism. There are a number of ..
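    The abstract's central claim, that speculative state storage is the critical resource bounding speedup, can be illustrated with a toy issue/retire simulation. The model and its parameters are assumptions for illustration, not the paper's simulator:

```python
# Illustrative model: each in-flight speculative instruction holds one
# state slot from issue until in-order retirement, so a full state
# buffer stalls issue regardless of available issue width.

from collections import deque

def run(num_insns, issue_width, state_slots, latency):
    """Return the cycle count to execute num_insns instructions."""
    in_flight = deque()     # completion cycles, kept in program order
    cycle = issued = 0
    while issued < num_insns or in_flight:
        # Retire completed instructions in order, freeing state slots.
        while in_flight and in_flight[0] <= cycle:
            in_flight.popleft()
        # Issue up to issue_width instructions while slots remain.
        for _ in range(issue_width):
            if issued < num_insns and len(in_flight) < state_slots:
                in_flight.append(cycle + latency)
                issued += 1
        cycle += 1
    return cycle

# Ample state storage sustains throughput near the issue width;
# scarce storage collapses it to roughly state_slots / latency.
print(1000 / run(1000, 16, 256, 8))   # approaches 16 IPC
print(1000 / run(1000, 16, 16, 8))    # limited to about 2 IPC
```

    The model also shows why early retirement matters: the sooner state is retired, the sooner its slot can back a new speculative instruction.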

    Space Constraints on High Levels of ILP

    ILP is one way of effectively using the large number of transistors available on modern CPUs. Two different architectures which use speculation on memory accesses to do this are reviewed. While this form of speculation gives high potential parallelism, it is necessary to retain execution state so that incorrect speculation can be detected and subsequently squashed. It is shown by theoretical arguments and simulation that the space to store such state is a critical resource in obtaining good speedup. The state must be stored efficiently and retired as soon as possible. It is also shown that larger problem sizes may achieve lower extracted parallelism, despite having a higher potential parallelism. 1 Introduction Increasingly, computer architects and system designers seek to extract more computer performance by making use of parallelism. There are a number of ways of approaching this. This paper considers the problem of extracting instruction level parallelism (ILP), that is, paral..